Overview

Dataset statistics

Number of variables             24
Number of observations          51290
Missing cells                   41296
Missing cells (%)               3.4%
Duplicate rows                  0
Duplicate rows (%)              0.0%
Total size in memory            9.4 MiB
Average record size in memory   192.0 B
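
The derived figures above can be cross-checked from the raw counts. The sketch below is only an illustration: the 8-bytes-per-cell figure is an assumption chosen because it reproduces the reported 192.0 B record size, not something the report states.

```python
# Cross-check of the dataset statistics reported above.
n_vars = 24
n_obs = 51290
missing_cells = 41296

total_cells = n_vars * n_obs                      # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of missing cells

# Assumption: each cell occupies 8 bytes (a 64-bit value or pointer).
bytes_per_record = 8 * n_vars
total_mib = bytes_per_record * n_obs / 2**20

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {bytes_per_record} B")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Under that assumption all three printed values agree with the panel above after rounding to one decimal place.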

Variable types

Numeric       7
Text          8
DateTime      2
Categorical   7

Alerts

Category is highly overall correlated with Sub-Category  [High correlation]
Discount is highly overall correlated with Profit  [High correlation]
Market is highly overall correlated with Postal Code and 2 other fields  [High correlation]
Postal Code is highly overall correlated with Market and 1 other fields  [High correlation]
Profit is highly overall correlated with Discount  [High correlation]
Region is highly overall correlated with Market and 2 other fields  [High correlation]
Row ID is highly overall correlated with Market and 1 other fields  [High correlation]
Sales is highly overall correlated with Shipping Cost  [High correlation]
Shipping Cost is highly overall correlated with Sales  [High correlation]
Sub-Category is highly overall correlated with Category  [High correlation]
Postal Code has 41296 (80.5%) missing values  [Missing]
Row ID is uniformly distributed  [Uniform]
Row ID has unique values  [Unique]
Discount has 29009 (56.6%) zeros  [Zeros]
Profit has 668 (1.3%) zeros  [Zeros]

Reproduction

Analysis started         2025-09-25 19:00:10.680235
Analysis finished        2025-09-25 19:00:18.952603
Duration                 8.27 seconds
Software version         ydata-profiling v4.16.1
Download configuration   config.json
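
The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction panel above.
started = datetime.fromisoformat("2025-09-25 19:00:10.680235")
finished = datetime.fromisoformat("2025-09-25 19:00:18.952603")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```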

Variables

Row ID
Real number (ℝ)

High correlation  Uniform  Unique 

Distinct        51290
Distinct (%)    100.0%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            25645.5
Minimum         1
Maximum         51290
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     400.8 KiB

Quantile statistics

Minimum                     1
5-th percentile             2565.45
Q1                          12823.25
Median                      25645.5
Q3                          38467.75
95-th percentile            48725.55
Maximum                     51290
Range                       51289
Interquartile range (IQR)   25644.5

Descriptive statistics

Standard deviation                14806.292
Coefficient of variation (CV)     0.57734464
Kurtosis                          -1.2
Mean                              25645.5
Median Absolute Deviation (MAD)   12822.5
Skewness                          -6.0069466 × 10^-18
Sum                               1.3153577 × 10^9
Variance                          2.1922628 × 10^8
Monotonicity                      Not monotonic
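
These figures are what one would expect if Row ID is simply the integers 1 to 51290 in some shuffled order. The sketch below recomputes several of them under that assumption. The `quantile` helper is a hypothetical illustration of linear-interpolation quantiles, not the profiler's own code, and the variance uses the sample (n - 1) denominator because that is the variant that reproduces the reported value.

```python
import math
import statistics

n = 51290                          # number of observations from the overview
row_ids = list(range(1, n + 1))    # assumption: Row ID is exactly 1..n

mean = statistics.fmean(row_ids)
variance = statistics.variance(row_ids)   # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                            # coefficient of variation
total = sum(row_ids)


def quantile(sorted_values, q):
    """Quantile of already sorted data using linear interpolation."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


q1 = quantile(row_ids, 0.25)
q3 = quantile(row_ids, 0.75)

print(mean, variance, round(std, 3), round(cv, 8), total)
print(q1, q3, q3 - q1)
```

Because the recomputation only depends on the set of values, it says nothing about row order, which is consistent with the column being reported as not monotonic.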
Histogram with fixed size bins (bins=50)
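
A histogram with fixed size bins splits the value range into equal-width intervals and counts the values in each. The function below is a minimal sketch of that idea; its name and the handling of the right edge are assumptions for illustration, not the plotting code used for the report.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning min..max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so use a single bin.
        return [len(values)]
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Assumption: the values are the integers 1..51290, as the statistics suggest.
counts = fixed_width_histogram(range(1, 51291), bins=50)
print(len(counts), min(counts), max(counts))
```

For a uniformly distributed identifier every bin ends up with nearly the same count, which is the flat shape the Uniform alert describes.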
Value                  Count   Frequency (%)
6147                   1       < 0.1%
32298                  1       < 0.1%
26341                  1       < 0.1%
25330                  1       < 0.1%
13524                  1       < 0.1%
47221                  1       < 0.1%
22732                  1       < 0.1%
30570                  1       < 0.1%
31192                  1       < 0.1%
40155                  1       < 0.1%
Other values (51280)   51280   > 99.9%
Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%
10      1       < 0.1%
Value   Count   Frequency (%)
51290   1       < 0.1%
51289   1       < 0.1%
51288   1       < 0.1%
51287   1       < 0.1%
51286   1       < 0.1%
51285   1       < 0.1%
51284   1       < 0.1%
51283   1       < 0.1%
51282   1       < 0.1%
51281   1       < 0.1%

Distinct       25035
Distinct (%)   48.8%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB

Length

Max length      15
Median length   14
Mean length     13.569643
Min length      9

Characters and Unicode

Total characters      695987
Distinct characters   37
Distinct categories   3
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
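
As a rough illustration of how such character properties can be read, Python's standard `unicodedata` module exposes the general category of each code point as a two-letter code (for example `Nd`, `Lu`, `Pd`). The mapping to the long names used in this report and the helper name below are assumptions added for readability. Script and block properties are not available from that module, so they are left out of the sketch.

```python
import unicodedata
from collections import Counter

# Long names as they appear in this report, keyed by Unicode category code.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}


def category_counts(text):
    """Count the characters of `text` per Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "CA-2012-124891"   # first sample value of this variable
counts = category_counts(sample)
named = {CATEGORY_NAMES.get(code, code): n for code, n in counts.items()}
print(named)
```

Summing such counters over every value of a column gives totals of the kind shown in the "Most occurring categories" tables.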

Unique

Unique       12257
Unique (%)   23.9%

Sample

1st row   CA-2012-124891
2nd row   IN-2013-77878
3rd row   IN-2013-71249
4th row   ES-2013-1579342
5th row   SG-2013-4320
Value                  Count   Frequency (%)
ca-2014-100111         14      < 0.1%
to-2014-9950           13      < 0.1%
ni-2014-8880           13      < 0.1%
in-2012-41261          13      < 0.1%
in-2013-42311          13      < 0.1%
mx-2014-166541         13      < 0.1%
in-2011-76625          12      < 0.1%
mx-2013-142678         12      < 0.1%
in-2014-15263          12      < 0.1%
mx-2013-127705         12      < 0.1%
Other values (25025)   51163   99.8%

Most occurring characters

ValueCountFrequency (%)
1 108762
15.6%
- 102580
14.7%
2 90147
13.0%
0 85011
12.2%
4 45213
 
6.5%
3 41592
 
6.0%
5 27294
 
3.9%
6 25799
 
3.7%
7 23086
 
3.3%
8 22626
 
3.3%
Other values (27) 123877
17.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 490827
70.5%
Dash Punctuation 102580
 
14.7%
Uppercase Letter 102580
 
14.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 15332
14.9%
S 13869
13.5%
A 9679
9.4%
C 9196
9.0%
N 8691
8.5%
E 8590
8.4%
M 8499
8.3%
X 7644
7.5%
U 6508
6.3%
T 3748
 
3.7%
Other values (16) 10824
10.6%
Decimal Number
ValueCountFrequency (%)
1 108762
22.2%
2 90147
18.4%
0 85011
17.3%
4 45213
9.2%
3 41592
 
8.5%
5 27294
 
5.6%
6 25799
 
5.3%
7 23086
 
4.7%
8 22626
 
4.6%
9 21297
 
4.3%
Dash Punctuation
ValueCountFrequency (%)
- 102580
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 593407
85.3%
Latin 102580
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 15332
14.9%
S 13869
13.5%
A 9679
9.4%
C 9196
9.0%
N 8691
8.5%
E 8590
8.4%
M 8499
8.3%
X 7644
7.5%
U 6508
6.3%
T 3748
 
3.7%
Other values (16) 10824
10.6%
Common
ValueCountFrequency (%)
1 108762
18.3%
- 102580
17.3%
2 90147
15.2%
0 85011
14.3%
4 45213
7.6%
3 41592
 
7.0%
5 27294
 
4.6%
6 25799
 
4.3%
7 23086
 
3.9%
8 22626
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 695987
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 108762
15.6%
- 102580
14.7%
2 90147
13.0%
0 85011
12.2%
4 45213
 
6.5%
3 41592
 
6.0%
5 27294
 
3.9%
6 25799
 
3.7%
7 23086
 
3.3%
8 22626
 
3.3%
Other values (27) 123877
17.8%

Distinct            1430
Distinct (%)        2.8%
Missing             0
Missing (%)         0.0%
Memory size         400.8 KiB
Minimum             2011-01-01 00:00:00
Maximum             2014-12-31 00:00:00
Invalid dates       0
Invalid dates (%)   0.0%
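
The distinct count can be put in context by comparing it with the number of calendar days between the reported minimum and maximum. This is a small sketch of that comparison. It assumes every value is a whole day (all times are 00:00:00), which matches the minimum and maximum shown but is not stated for the rest of the column.

```python
from datetime import date

first, last = date(2011, 1, 1), date(2014, 12, 31)
n_obs = 51290
distinct_days = 1430

calendar_days = (last - first).days + 1            # inclusive span of the range
days_without_rows = calendar_days - distinct_days  # dates that never occur
distinct_pct = 100 * distinct_days / n_obs

print(calendar_days, days_without_rows, f"{distinct_pct:.1f}%")
```

Under that assumption only a few dozen calendar days in the four-year range have no rows at all.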
Histogram with fixed size bins (bins=50)

Distinct            1464
Distinct (%)        2.9%
Missing             0
Missing (%)         0.0%
Memory size         400.8 KiB
Minimum             2011-01-02 00:00:00
Maximum             2015-07-01 00:00:00
Invalid dates       0
Invalid dates (%)   0.0%
Histogram with fixed size bins (bins=50)

Ship Mode
Categorical

Distinct       4
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB
Standard Class   30775
Second Class     10309
First Class      7505
Same Day         2701

Length

Max length      14
Median length   14
Mean length     12.843069
Min length      8

Characters and Unicode

Total characters      658721
Distinct characters   18
Distinct categories   3
Distinct scripts      2
Distinct blocks       1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique       0
Unique (%)   0.0%

Sample

1st row   Same Day
2nd row   Second Class
3rd row   First Class
4th row   First Class
5th row   Same Day

Common Values

Value            Count   Frequency (%)
Standard Class   30775   60.0%
Second Class     10309   20.1%
First Class      7505    14.6%
Same Day         2701    5.3%
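
The frequency column and the length statistics reported for Ship Mode can both be derived from the four counts above. A minimal sketch, using only the values shown in this section:

```python
# Counts copied from the Common Values table for Ship Mode.
counts = {
    "Standard Class": 30775,
    "Second Class": 10309,
    "First Class": 7505,
    "Same Day": 2701,
}

total = sum(counts.values())
frequency = {value: round(100 * n / total, 1) for value, n in counts.items()}

# Every row contributes the length of its label to the character total.
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / total

print(total, frequency)
print(total_chars, round(mean_length, 6))
```

The totals match the 51290 observations, the 658721 total characters and the 12.843069 mean length reported for this variable.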

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count   Frequency (%)
class      48589   47.4%
standard   30775   30.0%
second     10309   10.0%
first      7505    7.3%
same       2701    2.6%
day        2701    2.6%
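
The word counts above are consistent with lowercasing each Ship Mode value and splitting it on whitespace. The sketch below reproduces them on that assumption. The tokenisation rule is inferred from the numbers, not taken from the profiler's source.

```python
from collections import Counter

# Row counts per Ship Mode value, as listed in this section.
value_counts = {
    "Standard Class": 30775,
    "Second Class": 10309,
    "First Class": 7505,
    "Same Day": 2701,
}

words = Counter()
for value, n in value_counts.items():
    # Assumed tokenisation: lowercase, then split on whitespace.
    for token in value.lower().split():
        words[token] += n

total_tokens = sum(words.values())
share = {word: round(100 * n / total_tokens, 1) for word, n in words.items()}
print(words.most_common())
print(share)
```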

Most occurring characters

ValueCountFrequency (%)
a 115541
17.5%
s 104683
15.9%
d 71859
10.9%
51290
7.8%
C 48589
7.4%
l 48589
7.4%
S 43785
 
6.6%
n 41084
 
6.2%
t 38280
 
5.8%
r 38280
 
5.8%
Other values (8) 56741
8.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 504851
76.6%
Uppercase Letter 102580
 
15.6%
Space Separator 51290
 
7.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 115541
22.9%
s 104683
20.7%
d 71859
14.2%
l 48589
9.6%
n 41084
 
8.1%
t 38280
 
7.6%
r 38280
 
7.6%
e 13010
 
2.6%
c 10309
 
2.0%
o 10309
 
2.0%
Other values (3) 12907
 
2.6%
Uppercase Letter
ValueCountFrequency (%)
C 48589
47.4%
S 43785
42.7%
F 7505
 
7.3%
D 2701
 
2.6%
Space Separator
ValueCountFrequency (%)
51290
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 607431
92.2%
Common 51290
 
7.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 115541
19.0%
s 104683
17.2%
d 71859
11.8%
C 48589
8.0%
l 48589
8.0%
S 43785
 
7.2%
n 41084
 
6.8%
t 38280
 
6.3%
r 38280
 
6.3%
e 13010
 
2.1%
Other values (7) 43731
 
7.2%
Common
ValueCountFrequency (%)
51290
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 658721
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 115541
17.5%
s 104683
15.9%
d 71859
10.9%
51290
7.8%
C 48589
7.4%
l 48589
7.4%
S 43785
 
6.6%
n 41084
 
6.2%
t 38280
 
5.8%
r 38280
 
5.8%
Other values (8) 56741
8.6%

Distinct       1590
Distinct (%)   3.1%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB

Length

Max length8
Median length8
Mean length7.8146227
Min length5

Characters and Unicode

Total characters400812
Distinct characters40
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7 ?
Unique (%)< 0.1%

Sample

1st rowRH-19495
2nd rowJR-16210
3rd rowCR-12730
4th rowKM-16375
5th rowRH-9495
Value                 Count   Frequency (%)
po-18850              97      0.2%
be-11335              94      0.2%
jg-15805              90      0.2%
sw-20755              89      0.2%
em-13960              85      0.2%
my-18295              85      0.2%
mp-17965              84      0.2%
zc-21910              84      0.2%
ck-12205              83      0.2%
af-10870              81      0.2%
Other values (1580)   50418   98.3%

Most occurring characters

ValueCountFrequency (%)
1 54738
13.7%
- 51290
12.8%
0 43087
 
10.7%
5 39862
 
9.9%
2 21594
 
5.4%
8 14778
 
3.7%
6 14703
 
3.7%
7 14670
 
3.7%
3 14635
 
3.7%
4 14487
 
3.6%
Other values (30) 116968
29.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 246942
61.6%
Uppercase Letter 102373
25.5%
Dash Punctuation 51290
 
12.8%
Lowercase Letter 207
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 9011
 
8.8%
C 8835
 
8.6%
S 8738
 
8.5%
B 8382
 
8.2%
D 6582
 
6.4%
J 6171
 
6.0%
A 5967
 
5.8%
H 5218
 
5.1%
P 5206
 
5.1%
R 4849
 
4.7%
Other values (16) 33414
32.6%
Decimal Number
ValueCountFrequency (%)
1 54738
22.2%
0 43087
17.4%
5 39862
16.1%
2 21594
 
8.7%
8 14778
 
6.0%
6 14703
 
6.0%
7 14670
 
5.9%
3 14635
 
5.9%
4 14487
 
5.9%
9 14388
 
5.8%
Lowercase Letter
ValueCountFrequency (%)
p 81
39.1%
o 68
32.9%
l 58
28.0%
Dash Punctuation
ValueCountFrequency (%)
- 51290
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 298232
74.4%
Latin 102580
 
25.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 9011
 
8.8%
C 8835
 
8.6%
S 8738
 
8.5%
B 8382
 
8.2%
D 6582
 
6.4%
J 6171
 
6.0%
A 5967
 
5.8%
H 5218
 
5.1%
P 5206
 
5.1%
R 4849
 
4.7%
Other values (19) 33621
32.8%
Common
ValueCountFrequency (%)
1 54738
18.4%
- 51290
17.2%
0 43087
14.4%
5 39862
13.4%
2 21594
 
7.2%
8 14778
 
5.0%
6 14703
 
4.9%
7 14670
 
4.9%
3 14635
 
4.9%
4 14487
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 400812
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 54738
13.7%
- 51290
12.8%
0 43087
 
10.7%
5 39862
 
9.9%
2 21594
 
5.4%
8 14778
 
3.7%
6 14703
 
3.7%
7 14670
 
3.7%
3 14635
 
3.7%
4 14487
 
3.6%
Other values (30) 116968
29.2%

Distinct       795
Distinct (%)   1.6%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB

Length

Max length22
Median length18
Mean length12.946227
Min length7

Characters and Unicode

Total characters664012
Distinct characters57
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRick Hansen
2nd rowJustin Ritter
3rd rowCraig Reiter
4th rowKatherine Murray
5th rowRick Hansen
Value                Count   Frequency (%)
michael              655     0.6%
john                 522     0.5%
paul                 438     0.4%
patrick              437     0.4%
tom                  430     0.4%
stewart              426     0.4%
anthony              424     0.4%
frank                422     0.4%
alan                 402     0.4%
bill                 402     0.4%
Other values (901)   98324   95.6%

Most occurring characters

ValueCountFrequency (%)
a 61455
 
9.3%
e 60921
 
9.2%
n 51801
 
7.8%
51592
 
7.8%
r 48359
 
7.3%
i 40342
 
6.1%
l 34229
 
5.2%
o 30793
 
4.6%
t 27197
 
4.1%
s 23187
 
3.5%
Other values (47) 234136
35.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 506250
76.2%
Uppercase Letter 105217
 
15.8%
Space Separator 51592
 
7.8%
Other Punctuation 728
 
0.1%
Dash Punctuation 225
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 61455
12.1%
e 60921
12.0%
n 51801
10.2%
r 48359
9.6%
i 40342
 
8.0%
l 34229
 
6.8%
o 30793
 
6.1%
t 27197
 
5.4%
s 23187
 
4.6%
h 19661
 
3.9%
Other values (18) 108305
21.4%
Uppercase Letter
ValueCountFrequency (%)
C 9426
 
9.0%
M 9185
 
8.7%
S 8738
 
8.3%
B 8677
 
8.2%
D 6780
 
6.4%
A 6305
 
6.0%
J 6171
 
5.9%
H 5434
 
5.2%
P 5206
 
4.9%
R 4970
 
4.7%
Other values (16) 34325
32.6%
Space Separator
ValueCountFrequency (%)
51592
100.0%
Other Punctuation
ValueCountFrequency (%)
' 728
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 225
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 611467
92.1%
Common 52545
 
7.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 61455
 
10.1%
e 60921
 
10.0%
n 51801
 
8.5%
r 48359
 
7.9%
i 40342
 
6.6%
l 34229
 
5.6%
o 30793
 
5.0%
t 27197
 
4.4%
s 23187
 
3.8%
h 19661
 
3.2%
Other values (44) 213522
34.9%
Common
ValueCountFrequency (%)
51592
98.2%
' 728
 
1.4%
- 225
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 663596
99.9%
None 416
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 61455
 
9.3%
e 60921
 
9.2%
n 51801
 
7.8%
51592
 
7.8%
r 48359
 
7.3%
i 40342
 
6.1%
l 34229
 
5.2%
o 30793
 
4.6%
t 27197
 
4.1%
s 23187
 
3.5%
Other values (44) 233720
35.2%
None
ValueCountFrequency (%)
ö 293
70.4%
ä 76
 
18.3%
ü 47
 
11.3%

Segment
Categorical

Distinct       3
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB
Consumer      26518
Corporate     15429
Home Office   9343

Length

Max length11
Median length8
Mean length8.8472997
Min length8

Characters and Unicode

Total characters453778
Distinct characters17
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowConsumer
2nd rowCorporate
3rd rowConsumer
4th rowHome Office
5th rowConsumer

Common Values

Value         Count   Frequency (%)
Consumer      26518   51.7%
Corporate     15429   30.1%
Home Office   9343    18.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count   Frequency (%)
consumer    26518   43.7%
corporate   15429   25.4%
home        9343    15.4%
office      9343    15.4%

Most occurring characters

ValueCountFrequency (%)
o 66719
14.7%
e 60633
13.4%
r 57376
12.6%
C 41947
9.2%
m 35861
7.9%
u 26518
 
5.8%
s 26518
 
5.8%
n 26518
 
5.8%
f 18686
 
4.1%
p 15429
 
3.4%
Other values (7) 77573
17.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 383802
84.6%
Uppercase Letter 60633
 
13.4%
Space Separator 9343
 
2.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 66719
17.4%
e 60633
15.8%
r 57376
14.9%
m 35861
9.3%
u 26518
 
6.9%
s 26518
 
6.9%
n 26518
 
6.9%
f 18686
 
4.9%
p 15429
 
4.0%
a 15429
 
4.0%
Other values (3) 34115
8.9%
Uppercase Letter
ValueCountFrequency (%)
C 41947
69.2%
H 9343
 
15.4%
O 9343
 
15.4%
Space Separator
ValueCountFrequency (%)
9343
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 444435
97.9%
Common 9343
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 66719
15.0%
e 60633
13.6%
r 57376
12.9%
C 41947
9.4%
m 35861
8.1%
u 26518
 
6.0%
s 26518
 
6.0%
n 26518
 
6.0%
f 18686
 
4.2%
p 15429
 
3.5%
Other values (6) 68230
15.4%
Common
ValueCountFrequency (%)
9343
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 453778
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 66719
14.7%
e 60633
13.4%
r 57376
12.6%
C 41947
9.2%
m 35861
7.9%
u 26518
 
5.8%
s 26518
 
5.8%
n 26518
 
5.8%
f 18686
 
4.1%
p 15429
 
3.4%
Other values (7) 77573
17.1%

City
Text

Distinct       3636
Distinct (%)   7.1%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB

Length

Max length35
Median length29
Mean length8.419302
Min length2

Characters and Unicode

Total characters431826
Distinct characters76
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique488 ?
Unique (%)1.0%

Sample

1st rowNew York City
2nd rowWollongong
3rd rowBrisbane
4th rowBerlin
5th rowDakar
Value                 Count   Frequency (%)
city                  1789    2.8%
san                   1671    2.6%
new                   958     1.5%
york                  950     1.5%
los                   874     1.4%
angeles               751     1.2%
de                    599     0.9%
francisco             557     0.9%
philadelphia          537     0.8%
santo                 465     0.7%
Other values (3806)   54688   85.7%

Most occurring characters

ValueCountFrequency (%)
a 55167
 
12.8%
n 32502
 
7.5%
e 31845
 
7.4%
o 30443
 
7.0%
i 27026
 
6.3%
r 23842
 
5.5%
l 21403
 
5.0%
s 16146
 
3.7%
t 15964
 
3.7%
u 15609
 
3.6%
Other values (66) 161879
37.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 354182
82.0%
Uppercase Letter 63411
 
14.7%
Space Separator 12549
 
2.9%
Dash Punctuation 1318
 
0.3%
Other Punctuation 356
 
0.1%
Open Punctuation 4
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Control 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 55167
15.6%
n 32502
 
9.2%
e 31845
 
9.0%
o 30443
 
8.6%
i 27026
 
7.6%
r 23842
 
6.7%
l 21403
 
6.0%
s 16146
 
4.6%
t 15964
 
4.5%
u 15609
 
4.4%
Other values (31) 84235
23.8%
Uppercase Letter
ValueCountFrequency (%)
S 7463
11.8%
C 7203
11.4%
M 6059
 
9.6%
B 4731
 
7.5%
L 4204
 
6.6%
A 4133
 
6.5%
P 4011
 
6.3%
T 2652
 
4.2%
D 2554
 
4.0%
N 2447
 
3.9%
Other values (17) 17954
28.3%
Other Punctuation
ValueCountFrequency (%)
' 341
95.8%
. 8
 
2.2%
? 7
 
2.0%
Space Separator
ValueCountFrequency (%)
12549
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1318
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Control
ValueCountFrequency (%)
’ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 417593
96.7%
Common 14233
 
3.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 55167
 
13.2%
n 32502
 
7.8%
e 31845
 
7.6%
o 30443
 
7.3%
i 27026
 
6.5%
r 23842
 
5.7%
l 21403
 
5.1%
s 16146
 
3.9%
t 15964
 
3.8%
u 15609
 
3.7%
Other values (58) 147646
35.4%
Common
ValueCountFrequency (%)
12549
88.2%
- 1318
 
9.3%
' 341
 
2.4%
. 8
 
0.1%
? 7
 
< 0.1%
( 4
 
< 0.1%
) 4
 
< 0.1%
’ 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 429396
99.4%
None 2430
 
0.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 55167
 
12.8%
n 32502
 
7.6%
e 31845
 
7.4%
o 30443
 
7.1%
i 27026
 
6.3%
r 23842
 
5.6%
l 21403
 
5.0%
s 16146
 
3.8%
t 15964
 
3.7%
u 15609
 
3.6%
Other values (49) 159449
37.1%
None
ValueCountFrequency (%)
á 645
26.5%
í 507
20.9%
ó 418
17.2%
é 290
11.9%
ã 261
10.7%
ú 89
 
3.7%
ü 55
 
2.3%
ç 52
 
2.1%
ñ 34
 
1.4%
Á 32
 
1.3%
Other values (7) 47
 
1.9%

State
Text

Distinct       1094
Distinct (%)   2.1%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB

Length

Max length36
Median length26
Mean length9.6409826
Min length3

Characters and Unicode

Total characters494486
Distinct characters81
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique64 ?
Unique (%)0.1%

Sample

1st rowNew York
2nd rowNew South Wales
3rd rowQueensland
4th rowBerlin
5th rowDakar
Value                 Count   Frequency (%)
california            2125    3.1%
new                   2103    3.1%
england               1499    2.2%
south                 1201    1.8%
north                 1145    1.7%
york                  1128    1.7%
texas                 985     1.4%
ile-de-france         981     1.4%
wales                 817     1.2%
capital               784     1.1%
Other values (1189)   55429   81.3%

Most occurring characters

ValueCountFrequency (%)
a 72974
14.8%
n 39413
 
8.0%
i 33410
 
6.8%
e 30993
 
6.3%
o 28369
 
5.7%
r 28361
 
5.7%
l 23890
 
4.8%
t 21263
 
4.3%
s 19509
 
3.9%
16907
 
3.4%
Other values (71) 179397
36.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 398796
80.6%
Uppercase Letter 71791
 
14.5%
Space Separator 16907
 
3.4%
Dash Punctuation 5834
 
1.2%
Other Punctuation 946
 
0.2%
Open Punctuation 103
 
< 0.1%
Close Punctuation 103
 
< 0.1%
Control 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 72974
18.3%
n 39413
9.9%
i 33410
 
8.4%
e 30993
 
7.8%
o 28369
 
7.1%
r 28361
 
7.1%
l 23890
 
6.0%
t 21263
 
5.3%
s 19509
 
4.9%
u 14632
 
3.7%
Other values (35) 85982
21.6%
Uppercase Letter
ValueCountFrequency (%)
C 7579
 
10.6%
S 6867
 
9.6%
A 5516
 
7.7%
N 5001
 
7.0%
M 4112
 
5.7%
P 4003
 
5.6%
B 3595
 
5.0%
T 3083
 
4.3%
W 3016
 
4.2%
F 2601
 
3.6%
Other values (17) 26418
36.8%
Other Punctuation
ValueCountFrequency (%)
' 764
80.8%
? 135
 
14.3%
. 47
 
5.0%
Control
ValueCountFrequency (%)
Š 4
66.7%
Ž 2
33.3%
Space Separator
ValueCountFrequency (%)
16907
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5834
100.0%
Open Punctuation
ValueCountFrequency (%)
( 103
100.0%
Close Punctuation
ValueCountFrequency (%)
) 103
100.0%
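
The Control rows above list Š and Ž, which are ordinary letters, not control characters. One plausible reading, which this report does not confirm, is that the data contains C1 control code points (U+0080 to U+009F) and that they are merely displayed through Windows-1252, where those byte values map to printable glyphs. The sketch below shows that correspondence. The helper name is hypothetical.

```python
import unicodedata


def c1_reading(shown):
    """Return (byte value, category) of the code point that has the same
    numeric value as the Windows-1252 byte for `shown`."""
    raw = shown.encode("cp1252")       # single byte in Windows-1252
    code_point = raw.decode("latin-1")  # same byte value as a code point
    return raw[0], unicodedata.category(code_point)


for glyph in ("Š", "Ž", "’"):
    byte, category = c1_reading(glyph)
    print(f"{glyph!r}: byte 0x{byte:02X} -> U+{byte:04X}, "
          f"category {category}; the glyph itself is "
          f"{unicodedata.category(glyph)}")
```

If this reading is right, the affected values were probably decoded with the wrong encoding upstream, and the handful of Control characters mark rows worth inspecting.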

Most occurring scripts

ValueCountFrequency (%)
Latin 470587
95.2%
Common 23899
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 72974
15.5%
n 39413
 
8.4%
i 33410
 
7.1%
e 30993
 
6.6%
o 28369
 
6.0%
r 28361
 
6.0%
l 23890
 
5.1%
t 21263
 
4.5%
s 19509
 
4.1%
u 14632
 
3.1%
Other values (62) 157773
33.5%
Common
ValueCountFrequency (%)
16907
70.7%
- 5834
 
24.4%
' 764
 
3.2%
? 135
 
0.6%
( 103
 
0.4%
) 103
 
0.4%
. 47
 
0.2%
Š 4
 
< 0.1%
Ž 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 489990
99.1%
None 4496
 
0.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 72974
14.9%
n 39413
 
8.0%
i 33410
 
6.8%
e 30993
 
6.3%
o 28369
 
5.8%
r 28361
 
5.8%
l 23890
 
4.9%
t 21263
 
4.3%
s 19509
 
4.0%
16907
 
3.5%
Other values (49) 174901
35.7%
None
ValueCountFrequency (%)
é 935
20.8%
á 875
19.5%
í 714
15.9%
ô 672
14.9%
ã 473
10.5%
ó 291
 
6.5%
ü 260
 
5.8%
è 63
 
1.4%
à 58
 
1.3%
Á 30
 
0.7%
Other values (12) 125
 
2.8%

Distinct       147
Distinct (%)   0.3%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB

Length

Max length32
Median length22
Mean length8.8366738
Min length4

Characters and Unicode

Total characters453233
Distinct characters54
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUnited States
2nd rowAustralia
3rd rowAustralia
4th rowGermany
5th rowSenegal
Value                Count   Frequency (%)
united               11641   17.1%
states               9994    14.7%
australia            2837    4.2%
france               2827    4.2%
mexico               2644    3.9%
germany              2065    3.0%
china                1880    2.8%
kingdom              1633    2.4%
brazil               1599    2.3%
india                1555    2.3%
Other values (154)   29420   43.2%

Most occurring characters

ValueCountFrequency (%)
a 55137
 
12.2%
e 43176
 
9.5%
i 40575
 
9.0%
t 40485
 
8.9%
n 36881
 
8.1%
d 21302
 
4.7%
r 20390
 
4.5%
s 18355
 
4.0%
16805
 
3.7%
o 14461
 
3.2%
Other values (44) 145666
32.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 368751
81.4%
Uppercase Letter 67287
 
14.8%
Space Separator 16805
 
3.7%
Open Punctuation 136
 
< 0.1%
Close Punctuation 136
 
< 0.1%
Other Punctuation 109
 
< 0.1%
Dash Punctuation 9
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 55137
15.0%
e 43176
11.7%
i 40575
11.0%
t 40485
11.0%
n 36881
10.0%
d 21302
 
5.8%
r 20390
 
5.5%
s 18355
 
5.0%
o 14461
 
3.9%
l 13507
 
3.7%
Other values (16) 64482
17.5%
Uppercase Letter
ValueCountFrequency (%)
S 13329
19.8%
U 12131
18.0%
I 5366
8.0%
A 4822
 
7.2%
C 4252
 
6.3%
M 3719
 
5.5%
F 2891
 
4.3%
G 2795
 
4.2%
N 2745
 
4.1%
B 2339
 
3.5%
Other values (13) 12898
19.2%
Space Separator
ValueCountFrequency (%)
16805
100.0%
Open Punctuation
ValueCountFrequency (%)
( 136
100.0%
Close Punctuation
ValueCountFrequency (%)
) 136
100.0%
Other Punctuation
ValueCountFrequency (%)
' 109
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 436038
96.2%
Common 17195
 
3.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 55137
12.6%
e 43176
 
9.9%
i 40575
 
9.3%
t 40485
 
9.3%
n 36881
 
8.5%
d 21302
 
4.9%
r 20390
 
4.7%
s 18355
 
4.2%
o 14461
 
3.3%
l 13507
 
3.1%
Other values (39) 131769
30.2%
Common
ValueCountFrequency (%)
16805
97.7%
( 136
 
0.8%
) 136
 
0.8%
' 109
 
0.6%
- 9
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 453233
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 55137
 
12.2%
e 43176
 
9.5%
i 40575
 
9.0%
t 40485
 
8.9%
n 36881
 
8.1%
d 21302
 
4.7%
r 20390
 
4.5%
s 18355
 
4.0%
16805
 
3.7%
o 14461
 
3.2%
Other values (44) 145666
32.1%

Postal Code
Real number (ℝ)

High correlation  Missing 

Distinct        631
Distinct (%)    6.3%
Missing         41296
Missing (%)     80.5%
Infinite        0
Infinite (%)    0.0%
Mean            55190.379
Minimum         1040
Maximum         99301
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     400.8 KiB
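
The percentages for Postal Code use different denominators: Missing (%) is relative to all rows, while the reported Distinct (%) only matches the count of non-missing rows. The sketch below checks both. It also recomputes the mean from the rounded Sum reported under Descriptive statistics for this variable, so that check only holds approximately.

```python
n_obs = 51290
missing = 41296
distinct = 631
reported_sum = 5.5157265e8   # rounded Sum reported for Postal Code

present = n_obs - missing                   # rows that have a postal code
missing_pct = 100 * missing / n_obs         # relative to all rows
distinct_pct = 100 * distinct / present     # relative to non-missing rows
mean = reported_sum / present

print(present, f"{missing_pct:.1f}%", f"{distinct_pct:.1f}%", round(mean, 3))
```

The 9994 non-missing rows equal the US count in the Market section, which suggests, without proving it, that postal codes are only recorded for US orders.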

Quantile statistics

Minimum                     1040
5-th percentile             10009
Q1                          23223
Median                      56430.5
Q3                          90008
95-th percentile            98006
Maximum                     99301
Range                       98261
Interquartile range (IQR)   66785

Descriptive statistics

Standard deviation                32063.693
Coefficient of variation (CV)     0.58096526
Kurtosis                          -1.4930202
Mean                              55190.379
Median Absolute Deviation (MAD)   33573.5
Skewness                          -0.12852552
Sum                               5.5157265 × 10^8
Variance                          1.0280804 × 10^9
Monotonicity                      Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
10035                263     0.5%
10024                230     0.4%
10009                229     0.4%
94122                203     0.4%
10011                193     0.4%
94110                166     0.3%
98105                165     0.3%
19134                160     0.3%
98103                151     0.3%
90049                151     0.3%
Other values (621)   8083    15.8%
(Missing)            41296   80.5%
Value   Count   Frequency (%)
1040    1       < 0.1%
1453    6       < 0.1%
1752    2       < 0.1%
1810    4       < 0.1%
1841    33      0.1%
1852    16      < 0.1%
1915    3       < 0.1%
2038    17      < 0.1%
2138    6       < 0.1%
2148    3       < 0.1%
Value   Count   Frequency (%)
99301   6       < 0.1%
99207   7       < 0.1%
98661   5       < 0.1%
98632   3       < 0.1%
98502   5       < 0.1%
98270   2       < 0.1%
98226   3       < 0.1%
98208   1       < 0.1%
98198   7       < 0.1%
98115   112     0.2%

Market
Categorical

High correlation 

Distinct       7
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB
APAC               11002
LATAM              10294
EU                 10000
US                 9994
EMEA               5029
Other values (2)   4971

Length

Max length6
Median length5
Mean length3.6148957
Min length2

Characters and Unicode

Total characters185408
Distinct characters16
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowUS
2nd rowAPAC
3rd rowAPAC
4th rowEU
5th rowAfrica

Common Values

Value    Count   Frequency (%)
APAC     11002   21.5%
LATAM    10294   20.1%
EU       10000   19.5%
US       9994    19.5%
EMEA     5029    9.8%
Africa   4587    8.9%
Canada   384     0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count   Frequency (%)
apac     11002   21.5%
latam    10294   20.1%
eu       10000   19.5%
us       9994    19.5%
emea     5029    9.8%
africa   4587    8.9%
canada   384     0.7%

Most occurring characters

ValueCountFrequency (%)
A 52208
28.2%
E 20058
 
10.8%
U 19994
 
10.8%
M 15323
 
8.3%
C 11386
 
6.1%
P 11002
 
5.9%
L 10294
 
5.6%
T 10294
 
5.6%
S 9994
 
5.4%
a 5739
 
3.1%
Other values (6) 19116
 
10.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 160553
86.6%
Lowercase Letter 24855
 
13.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 52208
32.5%
E 20058
 
12.5%
U 19994
 
12.5%
M 15323
 
9.5%
C 11386
 
7.1%
P 11002
 
6.9%
L 10294
 
6.4%
T 10294
 
6.4%
S 9994
 
6.2%
Lowercase Letter
ValueCountFrequency (%)
a 5739
23.1%
r 4587
18.5%
f 4587
18.5%
i 4587
18.5%
c 4587
18.5%
n 384
 
1.5%
d 384
 
1.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 185408
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 52208
28.2%
E 20058
 
10.8%
U 19994
 
10.8%
M 15323
 
8.3%
C 11386
 
6.1%
P 11002
 
5.9%
L 10294
 
5.6%
T 10294
 
5.6%
S 9994
 
5.4%
a 5739
 
3.1%
Other values (6) 19116
 
10.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 185408
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 52208
28.2%
E 20058
 
10.8%
U 19994
 
10.8%
M 15323
 
8.3%
C 11386
 
6.1%
P 11002
 
5.9%
L 10294
 
5.6%
T 10294
 
5.6%
S 9994
 
5.4%
a 5739
 
3.1%
Other values (6) 19116
 
10.3%

Region
Categorical

High correlation 

Distinct       13
Distinct (%)   < 0.1%
Missing        0
Missing (%)    0.0%
Memory size    400.8 KiB
Central            11117
South              6645
EMEA               5029
North              4785
Africa             4587
Other values (8)   19127

Length

Max length14
Median length12
Mean length6.638643
Min length4

Characters and Unicode

Total characters340496
Distinct characters24
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEast
2nd rowOceania
3rd rowOceania
4th rowCentral
5th rowAfrica

Common Values

Value              Count   Frequency (%)
Central            11117   21.7%
South              6645    13.0%
EMEA               5029    9.8%
North              4785    9.3%
Africa             4587    8.9%
Oceania            3487    6.8%
West               3203    6.2%
Southeast Asia     3129    6.1%
East               2848    5.6%
North Asia         2338    4.6%
Other values (3)   4122    8.0%

Length

Histogram of lengths of the category
Value              Count   Frequency (%)
central            13165   22.4%
asia               7515    12.8%
north              7123    12.1%
south              6645    11.3%
emea               5029    8.6%
africa             4587    7.8%
oceania            3487    5.9%
west               3203    5.4%
southeast          3129    5.3%
east               2848    4.8%
Other values (2)   2074    3.5%

Most occurring characters

ValueCountFrequency (%)
a 42750
12.6%
t 39242
 
11.5%
r 26565
 
7.8%
e 24674
 
7.2%
n 18726
 
5.5%
i 17279
 
5.1%
A 17131
 
5.0%
h 16897
 
5.0%
o 16897
 
5.0%
s 16695
 
4.9%
Other values (14) 103640
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 259089
76.1%
Uppercase Letter 73892
 
21.7%
Space Separator 7515
 
2.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 42750
16.5%
t 39242
15.1%
r 26565
10.3%
e 24674
9.5%
n 18726
7.2%
i 17279
6.7%
h 16897
 
6.5%
o 16897
 
6.5%
s 16695
 
6.4%
l 13165
 
5.1%
Other values (5) 26199
10.1%
Uppercase Letter
ValueCountFrequency (%)
A 17131
23.2%
C 15239
20.6%
E 12906
17.5%
S 9774
13.2%
N 7123
9.6%
M 5029
 
6.8%
O 3487
 
4.7%
W 3203
 
4.3%
Space Separator
ValueCountFrequency (%)
7515
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 332981
97.8%
Common 7515
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 42750
12.8%
t 39242
11.8%
r 26565
 
8.0%
e 24674
 
7.4%
n 18726
 
5.6%
i 17279
 
5.2%
A 17131
 
5.1%
h 16897
 
5.1%
o 16897
 
5.1%
s 16695
 
5.0%
Other values (13) 96125
28.9%
Common
ValueCountFrequency (%)
7515
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 340496
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 42750
12.6%
t 39242
 
11.5%
r 26565
 
7.8%
e 24674
 
7.2%
n 18726
 
5.5%
i 17279
 
5.1%
A 17131
 
5.0%
h 16897
 
5.0%
o 16897
 
5.0%
s 16695
 
4.9%
Other values (14) 103640
30.4%

Product ID
Text

Distinct10292
Distinct (%)20.1%
Missing0
Missing (%)0.0%
Memory size400.8 KiB

Length

Max length16
Median length15
Mean length15.195009
Min length15

Characters and Unicode

Total characters779352
Distinct characters35
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1420 ?
Unique (%)2.8%

Sample

1st rowTEC-AC-10003033
2nd rowFUR-CH-10003950
3rd rowTEC-PH-10004664
4th rowTEC-PH-10004583
5th rowTEC-SHA-10000501
ValueCountFrequency (%)
tec-hp 83
 
0.2%
off-ar-10003651 35
 
0.1%
off-ar-10003829 31
 
0.1%
off-bi-10002799 30
 
0.1%
off-bi-10003708 30
 
0.1%
fur-ch-10003354 28
 
0.1%
off-bi-10002570 27
 
0.1%
off-bi-10004140 25
 
< 0.1%
off-bi-10001808 24
 
< 0.1%
off-bi-10004632 24
 
< 0.1%
Other values (10283) 51036
99.3%

Most occurring characters

ValueCountFrequency (%)
0 179614
23.0%
- 102580
13.2%
F 77961
10.0%
1 77220
9.9%
O 37143
 
4.8%
2 25593
 
3.3%
3 25555
 
3.3%
4 25148
 
3.2%
A 20235
 
2.6%
C 19282
 
2.5%
Other values (25) 189021
24.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 410320
52.6%
Uppercase Letter 266369
34.2%
Dash Punctuation 102580
 
13.2%
Space Separator 83
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
F 77961
29.3%
O 37143
13.9%
A 20235
 
7.6%
C 19282
 
7.2%
T 16042
 
6.0%
E 15297
 
5.7%
U 14862
 
5.6%
R 14680
 
5.5%
S 8467
 
3.2%
B 8234
 
3.1%
Other values (13) 34166
12.8%
Decimal Number
ValueCountFrequency (%)
0 179614
43.8%
1 77220
18.8%
2 25593
 
6.2%
3 25555
 
6.2%
4 25148
 
6.1%
5 16086
 
3.9%
7 15726
 
3.8%
8 15335
 
3.7%
9 15242
 
3.7%
6 14801
 
3.6%
Dash Punctuation
ValueCountFrequency (%)
- 102580
100.0%
Space Separator
ValueCountFrequency (%)
83
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 512983
65.8%
Latin 266369
34.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
F 77961
29.3%
O 37143
13.9%
A 20235
 
7.6%
C 19282
 
7.2%
T 16042
 
6.0%
E 15297
 
5.7%
U 14862
 
5.6%
R 14680
 
5.5%
S 8467
 
3.2%
B 8234
 
3.1%
Other values (13) 34166
12.8%
Common
ValueCountFrequency (%)
0 179614
35.0%
- 102580
20.0%
1 77220
15.1%
2 25593
 
5.0%
3 25555
 
5.0%
4 25148
 
4.9%
5 16086
 
3.1%
7 15726
 
3.1%
8 15335
 
3.0%
9 15242
 
3.0%
Other values (2) 14884
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 779352
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 179614
23.0%
- 102580
13.2%
F 77961
10.0%
1 77220
9.9%
O 37143
 
4.8%
2 25593
 
3.3%
3 25555
 
3.3%
4 25148
 
3.2%
A 20235
 
2.6%
C 19282
 
2.5%
Other values (25) 189021
24.3%

Category
Categorical

High correlation 

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size400.8 KiB
Office Supplies
31273 
Technology
10141 
Furniture
9876 

Length

Max length15
Median length15
Mean length12.856093
Min length9

Characters and Unicode

Total characters659389
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTechnology
2nd rowFurniture
3rd rowTechnology
4th rowTechnology
5th rowTechnology

Common Values

ValueCountFrequency (%)
Office Supplies 31273
61.0%
Technology 10141
 
19.8%
Furniture 9876
 
19.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
ValueCountFrequency (%)
office 31273
37.9%
supplies 31273
37.9%
technology 10141
 
12.3%
furniture 9876
 
12.0%

Most occurring characters

ValueCountFrequency (%)
e 82563
12.5%
i 72422
11.0%
p 62546
9.5%
f 62546
9.5%
u 51025
 
7.7%
c 41414
 
6.3%
l 41414
 
6.3%
O 31273
 
4.7%
S 31273
 
4.7%
31273
 
4.7%
Other values (10) 151640
23.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 545553
82.7%
Uppercase Letter 82563
 
12.5%
Space Separator 31273
 
4.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 82563
15.1%
i 72422
13.3%
p 62546
11.5%
f 62546
11.5%
u 51025
9.4%
c 41414
7.6%
l 41414
7.6%
s 31273
 
5.7%
o 20282
 
3.7%
n 20017
 
3.7%
Other values (5) 60051
11.0%
Uppercase Letter
ValueCountFrequency (%)
O 31273
37.9%
S 31273
37.9%
T 10141
 
12.3%
F 9876
 
12.0%
Space Separator
ValueCountFrequency (%)
31273
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 628116
95.3%
Common 31273
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 82563
13.1%
i 72422
11.5%
p 62546
10.0%
f 62546
10.0%
u 51025
8.1%
c 41414
 
6.6%
l 41414
 
6.6%
O 31273
 
5.0%
S 31273
 
5.0%
s 31273
 
5.0%
Other values (9) 120367
19.2%
Common
ValueCountFrequency (%)
31273
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 659389
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 82563
12.5%
i 72422
11.0%
p 62546
9.5%
f 62546
9.5%
u 51025
 
7.7%
c 41414
 
6.3%
l 41414
 
6.3%
O 31273
 
4.7%
S 31273
 
4.7%
31273
 
4.7%
Other values (10) 151640
23.0%

Sub-Category
Categorical

High correlation 

Distinct17
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size400.8 KiB
Binders
6152 
Storage
5059 
Art
4883 
Paper
3538 
Chairs
3434 
Other values (12)
28224 

Length

Max length11
Median length9
Mean length7.2304933
Min length3

Characters and Unicode

Total characters370852
Distinct characters28
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAccessories
2nd rowChairs
3rd rowPhones
4th rowPhones
5th rowCopiers

Common Values

ValueCountFrequency (%)
Binders 6152
12.0%
Storage 5059
 
9.9%
Art 4883
 
9.5%
Paper 3538
 
6.9%
Chairs 3434
 
6.7%
Phones 3357
 
6.5%
Furnishings 3170
 
6.2%
Accessories 3075
 
6.0%
Labels 2606
 
5.1%
Envelopes 2435
 
4.7%
Other values (7) 13581
26.5%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
binders 6152
12.0%
storage 5059
 
9.9%
art 4883
 
9.5%
paper 3538
 
6.9%
chairs 3434
 
6.7%
phones 3357
 
6.5%
furnishings 3170
 
6.2%
accessories 3075
 
6.0%
labels 2606
 
5.1%
envelopes 2435
 
4.7%
Other values (7) 13581
26.5%

Most occurring characters

ValueCountFrequency (%)
s 51961
14.0%
e 47733
12.9%
r 33954
 
9.2%
i 26890
 
7.3%
n 23945
 
6.5%
a 23570
 
6.4%
o 20971
 
5.7%
p 16556
 
4.5%
t 12362
 
3.3%
c 11802
 
3.2%
Other values (18) 101108
27.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 319562
86.2%
Uppercase Letter 51290
 
13.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 51961
16.3%
e 47733
14.9%
r 33954
10.6%
i 26890
8.4%
n 23945
7.5%
a 23570
7.4%
o 20971
6.6%
p 16556
 
5.2%
t 12362
 
3.9%
c 11802
 
3.7%
Other values (8) 49818
15.6%
Uppercase Letter
ValueCountFrequency (%)
A 9713
18.9%
B 8563
16.7%
S 7484
14.6%
P 6895
13.4%
C 5657
11.0%
F 5590
10.9%
L 2606
 
5.1%
E 2435
 
4.7%
M 1486
 
2.9%
T 861
 
1.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 370852
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 51961
14.0%
e 47733
12.9%
r 33954
 
9.2%
i 26890
 
7.3%
n 23945
 
6.5%
a 23570
 
6.4%
o 20971
 
5.7%
p 16556
 
4.5%
t 12362
 
3.3%
c 11802
 
3.2%
Other values (18) 101108
27.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 370852
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 51961
14.0%
e 47733
12.9%
r 33954
 
9.2%
i 26890
 
7.3%
n 23945
 
6.5%
a 23570
 
6.4%
o 20971
 
5.7%
p 16556
 
4.5%
t 12362
 
3.3%
c 11802
 
3.2%
Other values (18) 101108
27.3%

Product Name
Text

Distinct3788
Distinct (%)7.4%
Missing0
Missing (%)0.0%
Memory size400.8 KiB

Length

Max length127
Median length89
Mean length30.856931
Min length5

Characters and Unicode

Total characters1582652
Distinct characters85
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique98 ?
Unique (%)0.2%

Sample

1st rowPlantronics CS510 - Over-the-Head monaural Wireless Headset System
2nd rowNovimex Executive Leather Armchair, Black
3rd rowNokia Smart Phone, with Caller ID
4th rowMotorola Smart Phone, Cordless
5th rowSharp Wireless Fax, High-Speed
ValueCountFrequency (%)
labels 2385
 
1.0%
recycled 2291
 
1.0%
color 2187
 
0.9%
with 2177
 
0.9%
set 2106
 
0.9%
blue 2092
 
0.9%
durable 2072
 
0.9%
black 2055
 
0.9%
avery 1920
 
0.8%
clear 1893
 
0.8%
Other values (2826) 210320
90.9%

Most occurring characters

ValueCountFrequency (%)
179838
 
11.4%
e 154618
 
9.8%
a 94421
 
6.0%
r 91563
 
5.8%
o 88370
 
5.6%
l 79902
 
5.0%
i 79392
 
5.0%
n 68089
 
4.3%
t 62491
 
3.9%
s 60638
 
3.8%
Other values (75) 623330
39.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1084788
68.5%
Uppercase Letter 235084
 
14.9%
Space Separator 180265
 
11.4%
Other Punctuation 50142
 
3.2%
Decimal Number 25561
 
1.6%
Dash Punctuation 6566
 
0.4%
Control 86
 
< 0.1%
Close Punctuation 60
 
< 0.1%
Open Punctuation 60
 
< 0.1%
Math Symbol 35
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 154618
14.3%
a 94421
 
8.7%
r 91563
 
8.4%
o 88370
 
8.1%
l 79902
 
7.4%
i 79392
 
7.3%
n 68089
 
6.3%
t 62491
 
5.8%
s 60638
 
5.6%
c 43284
 
4.0%
Other values (18) 262020
24.2%
Uppercase Letter
ValueCountFrequency (%)
S 33233
14.1%
C 27670
11.8%
B 22724
 
9.7%
P 18037
 
7.7%
E 12943
 
5.5%
A 12468
 
5.3%
F 12209
 
5.2%
M 10589
 
4.5%
R 10364
 
4.4%
T 10107
 
4.3%
Other values (16) 64740
27.5%
Other Punctuation
ValueCountFrequency (%)
, 44416
88.6%
/ 1561
 
3.1%
& 1446
 
2.9%
" 1300
 
2.6%
. 998
 
2.0%
' 257
 
0.5%
# 90
 
0.2%
% 45
 
0.1%
* 9
 
< 0.1%
! 9
 
< 0.1%
Other values (2) 11
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 5377
21.0%
0 5118
20.0%
5 3094
12.1%
2 2756
10.8%
3 2628
10.3%
8 1808
 
7.1%
4 1725
 
6.7%
9 1234
 
4.8%
6 941
 
3.7%
7 880
 
3.4%
Space Separator
ValueCountFrequency (%)
179838
99.8%
  427
 
0.2%
Control
ValueCountFrequency (%)
” 67
77.9%
“ 19
 
22.1%
Dash Punctuation
ValueCountFrequency (%)
- 6566
100.0%
Close Punctuation
ValueCountFrequency (%)
) 60
100.0%
Open Punctuation
ValueCountFrequency (%)
( 60
100.0%
Math Symbol
ValueCountFrequency (%)
+ 35
100.0%
Other Number
ValueCountFrequency (%)
¾ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1319872
83.4%
Common 262780
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 154618
 
11.7%
a 94421
 
7.2%
r 91563
 
6.9%
o 88370
 
6.7%
l 79902
 
6.1%
i 79392
 
6.0%
n 68089
 
5.2%
t 62491
 
4.7%
s 60638
 
4.6%
c 43284
 
3.3%
Other values (44) 497104
37.7%
Common
ValueCountFrequency (%)
179838
68.4%
, 44416
 
16.9%
- 6566
 
2.5%
1 5377
 
2.0%
0 5118
 
1.9%
5 3094
 
1.2%
2 2756
 
1.0%
3 2628
 
1.0%
8 1808
 
0.7%
4 1725
 
0.7%
Other values (21) 9454
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1582117
> 99.9%
None 535
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
179838
 
11.4%
e 154618
 
9.8%
a 94421
 
6.0%
r 91563
 
5.8%
o 88370
 
5.6%
l 79902
 
5.1%
i 79392
 
5.0%
n 68089
 
4.3%
t 62491
 
3.9%
s 60638
 
3.8%
Other values (69) 622795
39.4%
None
ValueCountFrequency (%)
  427
79.8%
” 67
 
12.5%
“ 19
 
3.6%
é 14
 
2.6%
¾ 5
 
0.9%
à 3
 
0.6%

Sales
Real number (ℝ)

High correlation 

Distinct22995
Distinct (%)44.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean246.49058
Minimum0.444
Maximum22638.48
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size400.8 KiB

Quantile statistics

Minimum0.444
5-th percentile8.8
Q130.758625
median85.053
Q3251.0532
95-th percentile1015.9556
Maximum22638.48
Range22638.036
Interquartile range (IQR)220.29458

Descriptive statistics

Standard deviation487.56536
Coefficient of variation (CV)1.9780284
Kurtosis176.7312
Mean246.49058
Median Absolute Deviation (MAD)67.0062
Skewness8.13808
Sum12642502
Variance237719.98
MonotonicityNot monotonic
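
Several of the Sales figures above are derived from one another. The sketch below recomputes them from the reported mean, standard deviation, quartiles and extremes, assuming the usual definitions (CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum, variance = standard deviation squared). The variable names are chosen for illustration only.

```python
# Values copied from the Sales statistics in this report.
mean, std = 246.49058, 487.56536
q1, q3 = 30.758625, 251.0532
minimum, maximum = 0.444, 22638.48

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance

print(f"CV={cv:.7f}  IQR={iqr:.5f}  range={value_range:.3f}  variance={variance:.2f}")
```

The recomputed values agree with the reported ones up to the rounding of the inputs.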
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
12.96 66
 
0.1%
25.92 50
 
0.1%
19.44 43
 
0.1%
32.4 42
 
0.1%
15.552 41
 
0.1%
10.368 38
 
0.1%
26.88 36
 
0.1%
24 36
 
0.1%
26.4 35
 
0.1%
17.52 31
 
0.1%
Other values (22985) 50872
99.2%
ValueCountFrequency (%)
0.444 1
 
< 0.1%
0.556 1
 
< 0.1%
0.836 1
 
< 0.1%
0.852 1
 
< 0.1%
0.876 1
 
< 0.1%
0.898 1
 
< 0.1%
0.984 1
 
< 0.1%
0.99 1
 
< 0.1%
1.044 1
 
< 0.1%
1.08 3
< 0.1%
ValueCountFrequency (%)
22638.48 1
< 0.1%
17499.95 1
< 0.1%
13999.96 1
< 0.1%
11199.968 1
< 0.1%
10499.97 1
< 0.1%
9892.74 1
< 0.1%
9449.95 1
< 0.1%
9099.93 1
< 0.1%
8749.95 1
< 0.1%
8399.976 1
< 0.1%

Quantity
Real number (ℝ)

Distinct14
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean3.4765451
Minimum1
Maximum14
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size400.8 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median3
Q35
95-th percentile8
Maximum14
Range13
Interquartile range (IQR)3

Descriptive statistics

Standard deviation2.2787663
Coefficient of variation (CV)0.65546864
Kurtosis2.2758887
Mean3.4765451
Median Absolute Deviation (MAD)1
Skewness1.3603677
Sum178312
Variance5.1927759
MonotonicityNot monotonic
[Figure: histogram with fixed size bins (bins=14)]
ValueCountFrequency (%)
2 12748
24.9%
3 9682
18.9%
1 8963
17.5%
4 6385
12.4%
5 4882
 
9.5%
6 3020
 
5.9%
7 2385
 
4.7%
8 1361
 
2.7%
9 987
 
1.9%
10 276
 
0.5%
Other values (4) 601
 
1.2%
ValueCountFrequency (%)
1 8963
17.5%
2 12748
24.9%
3 9682
18.9%
4 6385
12.4%
5 4882
 
9.5%
6 3020
 
5.9%
7 2385
 
4.7%
8 1361
 
2.7%
9 987
 
1.9%
10 276
 
0.5%
ValueCountFrequency (%)
14 186
 
0.4%
13 83
 
0.2%
12 176
 
0.3%
11 156
 
0.3%
10 276
 
0.5%
9 987
 
1.9%
8 1361
 
2.7%
7 2385
4.7%
6 3020
5.9%
5 4882
9.5%
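
The Frequency (%) figures in the Quantity tables above can be reproduced by dividing each count by the number of observations (51290) and rounding to one decimal place. The sketch below does this for the listed values, and also checks that the four largest quantities add up to the "Other values (4)" row. The dictionary names are illustrative, and the rounding rule is an assumption that happens to match the figures shown.

```python
n_observations = 51290  # number of observations in the dataset

# Counts copied from the Quantity tables in this report.
common = {2: 12748, 3: 9682, 1: 8963, 4: 6385, 5: 4882,
          6: 3020, 7: 2385, 8: 1361, 9: 987, 10: 276}
other = {14: 186, 13: 83, 12: 176, 11: 156}  # the four values behind "Other values (4)"

# Share of each value as a percentage, rounded to one decimal place.
frequency_pct = {value: round(100 * count / n_observations, 1)
                 for value, count in common.items()}
other_total = sum(other.values())

for value, pct in frequency_pct.items():
    print(f"{value}: {pct}%")
print(f"Other values ({len(other)}): {other_total}")
```

All counts together sum to the number of observations, which is a quick sanity check that no value is missing from the tables.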

Discount
Real number (ℝ)

High correlation  Zeros 

Distinct27
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.14290755
Minimum0
Maximum0.85
Zeros29009
Zeros (%)56.6%
Negative0
Negative (%)0.0%
Memory size400.8 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30.2
95-th percentile0.6
Maximum0.85
Range0.85
Interquartile range (IQR)0.2

Descriptive statistics

Standard deviation0.21227993
Coefficient of variation (CV)1.4854354
Kurtosis0.71668241
Mean0.14290755
Median Absolute Deviation (MAD)0
Skewness1.3877746
Sum7329.728
Variance0.045062769
MonotonicityNot monotonic
[Figure: histogram with fixed size bins (bins=27)]
ValueCountFrequency (%)
0 29009
56.6%
0.2 4998
 
9.7%
0.1 4068
 
7.9%
0.4 3177
 
6.2%
0.6 2006
 
3.9%
0.7 1786
 
3.5%
0.5 1633
 
3.2%
0.17 735
 
1.4%
0.47 725
 
1.4%
0.15 541
 
1.1%
Other values (17) 2612
 
5.1%
ValueCountFrequency (%)
0 29009
56.6%
0.002 461
 
0.9%
0.07 150
 
0.3%
0.1 4068
 
7.9%
0.15 541
 
1.1%
0.17 735
 
1.4%
0.2 4998
 
9.7%
0.202 41
 
0.1%
0.25 198
 
0.4%
0.27 388
 
0.8%
ValueCountFrequency (%)
0.85 2
 
< 0.1%
0.8 316
 
0.6%
0.7 1786
3.5%
0.65 17
 
< 0.1%
0.602 23
 
< 0.1%
0.6 2006
3.9%
0.57 12
 
< 0.1%
0.55 10
 
< 0.1%
0.5 1633
3.2%
0.47 725
 
1.4%

Profit
Real number (ℝ)

High correlation  Zeros 

Distinct24575
Distinct (%)47.9%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean28.610982
Minimum-6599.978
Maximum8399.976
Zeros668
Zeros (%)1.3%
Negative12544
Negative (%)24.5%
Memory size400.8 KiB

Quantile statistics

Minimum-6599.978
5-th percentile-83.90475
Q10
median9.24
Q336.81
95-th percentile211.5
Maximum8399.976
Range14999.954
Interquartile range (IQR)36.81

Descriptive statistics

Standard deviation174.34097
Coefficient of variation (CV)6.0934983
Kurtosis291.41109
Mean28.610982
Median Absolute Deviation (MAD)15.96
Skewness4.1571885
Sum1467457.3
Variance30394.774
MonotonicityNot monotonic
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
0 668
 
1.3%
4.32 70
 
0.1%
3.96 69
 
0.1%
7.92 67
 
0.1%
2.64 63
 
0.1%
2.88 60
 
0.1%
6.84 57
 
0.1%
9 56
 
0.1%
0.48 55
 
0.1%
3.42 55
 
0.1%
Other values (24565) 50070
97.6%
ValueCountFrequency (%)
-6599.978 1
< 0.1%
-4088.376 1
< 0.1%
-3839.9904 1
< 0.1%
-3701.8928 1
< 0.1%
-3399.98 1
< 0.1%
-3059.82 1
< 0.1%
-3009.435 1
< 0.1%
-2929.4845 1
< 0.1%
-2750.28 1
< 0.1%
-2639.9912 1
< 0.1%
ValueCountFrequency (%)
8399.976 1
< 0.1%
6719.9808 1
< 0.1%
5039.9856 1
< 0.1%
4946.37 1
< 0.1%
4630.4755 1
< 0.1%
3979.08 1
< 0.1%
3919.9888 1
< 0.1%
3177.475 1
< 0.1%
2939.31 1
< 0.1%
2817.99 1
< 0.1%

Shipping Cost
Real number (ℝ)

High correlation 

Distinct10037
Distinct (%)19.6%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean26.375915
Minimum0
Maximum933.57
Zeros2
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size400.8 KiB

Quantile statistics

Minimum0
5-th percentile0.61
Q12.61
median7.79
Q324.45
95-th percentile111.4095
Maximum933.57
Range933.57
Interquartile range (IQR)21.84

Descriptive statistics

Standard deviation57.296804
Coefficient of variation (CV)2.1723153
Kurtosis50.020158
Mean26.375915
Median Absolute Deviation (MAD)6.41
Skewness5.8632264
Sum1352820.7
Variance3282.9237
MonotonicityDecreasing
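
Shipping Cost is the only numeric variable here reported as "Decreasing", which suggests the rows are sorted by this column. A minimal sketch of such a monotonicity check follows. The `monotonicity` function and its label strings are hypothetical and merely modeled on the report's wording, and the input is only the 20 Shipping Cost values visible in the sample rows of this report, not the full column.

```python
def monotonicity(values):
    """Classify a sequence by comparing each element with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"


# Shipping Cost in the first ten and last ten sample rows, in row order.
shipping_cost = [933.57, 923.63, 915.49, 910.16, 903.04,
                 897.35, 894.77, 878.38, 867.69, 865.74,
                 0.02, 0.02, 0.01, 0.01, 0.01,
                 0.01, 0.01, 0.01, 0.00, 0.00]
# Sales in the first three sample rows, for contrast.
sales = [2309.650, 3709.395, 5175.171, 2892.510]

print(monotonicity(shipping_cost))
print(monotonicity(sales))
```

Because of the repeated values at the low end, the sequence is decreasing but not strictly so, which is consistent with the label shown above.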
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
0.86 76
 
0.1%
0.71 75
 
0.1%
1.26 75
 
0.1%
1.36 74
 
0.1%
0.35 73
 
0.1%
0.94 71
 
0.1%
1.04 71
 
0.1%
0.79 71
 
0.1%
0.69 70
 
0.1%
1.3 70
 
0.1%
Other values (10027) 50564
98.6%
ValueCountFrequency (%)
0 2
 
< 0.1%
0.01 6
 
< 0.1%
0.02 9
 
< 0.1%
0.03 9
 
< 0.1%
0.04 14
< 0.1%
0.05 19
< 0.1%
0.06 18
< 0.1%
0.07 13
< 0.1%
0.08 17
< 0.1%
0.09 23
< 0.1%
ValueCountFrequency (%)
933.57 1
< 0.1%
923.63 1
< 0.1%
915.49 1
< 0.1%
910.16 1
< 0.1%
903.04 1
< 0.1%
897.35 1
< 0.1%
894.77 1
< 0.1%
878.38 1
< 0.1%
867.69 1
< 0.1%
865.74 1
< 0.1%

Order Priority
Categorical

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size400.8 KiB
Medium
29433 
High
15501 
Critical
3932 
Low
2424

Length

Max length8
Median length6
Mean length5.4070969
Min length3

Characters and Unicode

Total characters277330
Distinct characters18
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCritical
2nd rowCritical
3rd rowMedium
4th rowMedium
5th rowCritical

Common Values

ValueCountFrequency (%)
Medium 29433
57.4%
High 15501
30.2%
Critical 3932
 
7.7%
Low 2424
 
4.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
ValueCountFrequency (%)
medium 29433
57.4%
high 15501
30.2%
critical 3932
 
7.7%
low 2424
 
4.7%

Most occurring characters

ValueCountFrequency (%)
i 52798
19.0%
M 29433
10.6%
e 29433
10.6%
d 29433
10.6%
u 29433
10.6%
m 29433
10.6%
H 15501
 
5.6%
g 15501
 
5.6%
h 15501
 
5.6%
C 3932
 
1.4%
Other values (8) 26932
9.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 226040
81.5%
Uppercase Letter 51290
 
18.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 52798
23.4%
e 29433
13.0%
d 29433
13.0%
u 29433
13.0%
m 29433
13.0%
g 15501
 
6.9%
h 15501
 
6.9%
r 3932
 
1.7%
t 3932
 
1.7%
c 3932
 
1.7%
Other values (4) 12712
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
M 29433
57.4%
H 15501
30.2%
C 3932
 
7.7%
L 2424
 
4.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 277330
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 52798
19.0%
M 29433
10.6%
e 29433
10.6%
d 29433
10.6%
u 29433
10.6%
m 29433
10.6%
H 15501
 
5.6%
g 15501
 
5.6%
h 15501
 
5.6%
C 3932
 
1.4%
Other values (8) 26932
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 277330
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 52798
19.0%
M 29433
10.6%
e 29433
10.6%
d 29433
10.6%
u 29433
10.6%
m 29433
10.6%
H 15501
 
5.6%
g 15501
 
5.6%
h 15501
 
5.6%
C 3932
 
1.4%
Other values (8) 26932
9.7%

Interactions

[Figures: pairwise interaction plots between variables]

Correlations

[Figure: correlation heatmap]

Variable | Category | Discount | Market | Order Priority | Postal Code | Profit | Quantity | Region | Row ID | Sales | Segment | Ship Mode | Shipping Cost | Sub-Category
Category | 1.000 | 0.185 | 0.073 | 0.003 | 0.000 | 0.056 | 0.015 | 0.056 | 0.070 | 0.062 | 0.000 | 0.000 | 0.156 | 1.000
Discount | 0.185 | 1.000 | 0.348 | 0.011 | 0.053 | -0.596 | 0.018 | 0.335 | 0.017 | -0.100 | 0.000 | 0.019 | -0.094 | 0.159
Market | 0.073 | 0.348 | 1.000 | 0.013 | 1.000 | 0.009 | 0.157 | 0.882 | 0.797 | 0.021 | 0.009 | 0.014 | 0.032 | 0.128
Order Priority | 0.003 | 0.011 | 0.013 | 1.000 | 0.039 | 0.009 | 0.008 | 0.027 | 0.017 | 0.000 | 0.019 | 0.284 | 0.099 | 0.004
Postal Code | 0.000 | 0.053 | 1.000 | 0.039 | 1.000 | -0.005 | 0.014 | 0.921 | 0.011 | -0.002 | 0.035 | 0.038 | -0.005 | 0.000
Profit | 0.056 | -0.596 | 0.009 | 0.009 | -0.005 | 1.000 | 0.201 | 0.010 | -0.046 | 0.490 | 0.000 | 0.011 | 0.449 | 0.058
Quantity | 0.015 | 0.018 | 0.157 | 0.008 | 0.014 | 0.201 | 1.000 | 0.129 | -0.239 | 0.416 | 0.000 | 0.008 | 0.379 | 0.016
Region | 0.056 | 0.335 | 0.882 | 0.027 | 0.921 | 0.010 | 0.129 | 1.000 | 0.532 | 0.019 | 0.012 | 0.026 | 0.028 | 0.069
Row ID | 0.070 | 0.017 | 0.797 | 0.017 | 0.011 | -0.046 | -0.239 | 0.532 | 1.000 | -0.143 | 0.018 | 0.012 | -0.129 | 0.100
Sales | 0.062 | -0.100 | 0.021 | 0.000 | -0.002 | 0.490 | 0.416 | 0.019 | -0.143 | 1.000 | 0.000 | 0.000 | 0.913 | 0.061
Segment | 0.000 | 0.000 | 0.009 | 0.019 | 0.035 | 0.000 | 0.000 | 0.012 | 0.018 | 0.000 | 1.000 | 0.011 | 0.000 | 0.012
Ship Mode | 0.000 | 0.019 | 0.014 | 0.284 | 0.038 | 0.011 | 0.008 | 0.026 | 0.012 | 0.000 | 0.011 | 1.000 | 0.074 | 0.011
Shipping Cost | 0.156 | -0.094 | 0.032 | 0.099 | -0.005 | 0.449 | 0.379 | 0.028 | -0.129 | 0.913 | 0.000 | 0.074 | 1.000 | 0.112
Sub-Category | 1.000 | 0.159 | 0.128 | 0.004 | 0.000 | 0.058 | 0.016 | 0.069 | 0.100 | 0.061 | 0.012 | 0.011 | 0.112 | 1.000
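
To read the correlation matrix above at a glance, it can help to list only the strongly related pairs. The sketch below filters a few entries copied from the matrix by absolute value. The cut-off of 0.5 is an assumption made for illustration (it is not stated in this section), and `high_pairs` is a hypothetical helper, not part of any library.

```python
# A few off-diagonal entries copied from the correlation matrix in this report.
correlations = {
    ("Category", "Sub-Category"): 1.000,
    ("Discount", "Profit"): -0.596,
    ("Market", "Region"): 0.882,
    ("Postal Code", "Region"): 0.921,
    ("Region", "Row ID"): 0.532,
    ("Sales", "Shipping Cost"): 0.913,
    ("Profit", "Sales"): 0.490,
    ("Quantity", "Sales"): 0.416,
    ("Order Priority", "Ship Mode"): 0.284,
}

THRESHOLD = 0.5  # assumed cut-off for "high" correlation; not taken from the report


def high_pairs(corr, threshold):
    """Return pairs whose absolute correlation meets the threshold, strongest first."""
    selected = [pair for pair, value in corr.items() if abs(value) >= threshold]
    return sorted(selected, key=lambda pair: -abs(corr[pair]))


flagged = high_pairs(correlations, THRESHOLD)
for a, b in flagged:
    print(f"{a} ~ {b}: {correlations[(a, b)]:+.3f}")
```

Using the absolute value matters here: Discount and Profit are negatively related (-0.596) and would be missed by a filter on the signed value alone.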

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
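
The per-column nullity that these plots visualize can be sketched in a few lines. The example below counts missing entries per column, using the City and Postal Code values of the first ten sample rows of this report, with `None` standing in for NaN. `missing_share` is a hypothetical helper written for illustration.

```python
# City and Postal Code from the first ten sample rows; None stands in for NaN.
rows = [
    {"City": "New York City", "Postal Code": 10024.0},
    {"City": "Wollongong", "Postal Code": None},
    {"City": "Brisbane", "Postal Code": None},
    {"City": "Berlin", "Postal Code": None},
    {"City": "Dakar", "Postal Code": None},
    {"City": "Sydney", "Postal Code": None},
    {"City": "Porirua", "Postal Code": None},
    {"City": "Hamilton", "Postal Code": None},
    {"City": "Sacramento", "Postal Code": 95823.0},
    {"City": "Concord", "Postal Code": 28027.0},
]


def missing_share(records, column):
    """Return (number of missing entries, fraction missing) for one column."""
    missing = sum(1 for record in records if record.get(column) is None)
    return missing, missing / len(records)


for column in ("City", "Postal Code"):
    count, share = missing_share(rows, column)
    print(f"{column}: {count} missing ({share:.0%})")
```

In these ten rows, Postal Code is present only for the United States entries.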

Sample

(index) | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | City | State | Country | Postal Code | Market | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit | Shipping Cost | Order Priority
0 | 32298 | CA-2012-124891 | 31-07-2012 | 31-07-2012 | Same Day | RH-19495 | Rick Hansen | Consumer | New York City | New York | United States | 10024.0 | US | East | TEC-AC-10003033 | Technology | Accessories | Plantronics CS510 - Over-the-Head monaural Wireless Headset System | 2309.650 | 7 | 0.0 | 762.1845 | 933.57 | Critical
1 | 26341 | IN-2013-77878 | 05-02-2013 | 07-02-2013 | Second Class | JR-16210 | Justin Ritter | Corporate | Wollongong | New South Wales | Australia | NaN | APAC | Oceania | FUR-CH-10003950 | Furniture | Chairs | Novimex Executive Leather Armchair, Black | 3709.395 | 9 | 0.1 | -288.7650 | 923.63 | Critical
2 | 25330 | IN-2013-71249 | 17-10-2013 | 18-10-2013 | First Class | CR-12730 | Craig Reiter | Consumer | Brisbane | Queensland | Australia | NaN | APAC | Oceania | TEC-PH-10004664 | Technology | Phones | Nokia Smart Phone, with Caller ID | 5175.171 | 9 | 0.1 | 919.9710 | 915.49 | Medium
3 | 13524 | ES-2013-1579342 | 28-01-2013 | 30-01-2013 | First Class | KM-16375 | Katherine Murray | Home Office | Berlin | Berlin | Germany | NaN | EU | Central | TEC-PH-10004583 | Technology | Phones | Motorola Smart Phone, Cordless | 2892.510 | 5 | 0.1 | -96.5400 | 910.16 | Medium
4 | 47221 | SG-2013-4320 | 05-11-2013 | 06-11-2013 | Same Day | RH-9495 | Rick Hansen | Consumer | Dakar | Dakar | Senegal | NaN | Africa | Africa | TEC-SHA-10000501 | Technology | Copiers | Sharp Wireless Fax, High-Speed | 2832.960 | 8 | 0.0 | 311.5200 | 903.04 | Critical
5 | 22732 | IN-2013-42360 | 28-06-2013 | 01-07-2013 | Second Class | JM-15655 | Jim Mitchum | Corporate | Sydney | New South Wales | Australia | NaN | APAC | Oceania | TEC-PH-10000030 | Technology | Phones | Samsung Smart Phone, with Caller ID | 2862.675 | 5 | 0.1 | 763.2750 | 897.35 | Critical
6 | 30570 | IN-2011-81826 | 07-11-2011 | 09-11-2011 | First Class | TS-21340 | Toby Swindell | Consumer | Porirua | Wellington | New Zealand | NaN | APAC | Oceania | FUR-CH-10004050 | Furniture | Chairs | Novimex Executive Leather Armchair, Adjustable | 1822.080 | 4 | 0.0 | 564.8400 | 894.77 | Critical
7 | 31192 | IN-2012-86369 | 14-04-2012 | 18-04-2012 | Standard Class | MB-18085 | Mick Brown | Consumer | Hamilton | Waikato | New Zealand | NaN | APAC | Oceania | FUR-TA-10002958 | Furniture | Tables | Chromcraft Conference Table, Fully Assembled | 5244.840 | 6 | 0.0 | 996.4800 | 878.38 | High
8 | 40155 | CA-2014-135909 | 14-10-2014 | 21-10-2014 | Standard Class | JW-15220 | Jane Waco | Corporate | Sacramento | California | United States | 95823.0 | US | West | OFF-BI-10003527 | Office Supplies | Binders | Fellowes PB500 Electric Punch Plastic Comb Binding Machine with Manual Bind | 5083.960 | 5 | 0.2 | 1906.4850 | 867.69 | Low
9 | 40936 | CA-2012-116638 | 28-01-2012 | 31-01-2012 | Second Class | JH-15985 | Joseph Holt | Consumer | Concord | North Carolina | United States | 28027.0 | US | South | FUR-TA-10000198 | Furniture | Tables | Chromcraft Bull-Nose Wood Oval Conference Tables & Bases | 4297.644 | 13 | 0.4 | -1862.3124 | 865.74 | Critical

(index) | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | City | State | Country | Postal Code | Market | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit | Shipping Cost | Order Priority
51280 | 46582 | TU-2014-6730 | 29-11-2014 | 30-11-2014 | First Class | KF-6285 | Karen Ferguson | Home Office | Midyat | Mardin | Turkey | NaN | EMEA | EMEA | OFF-BOS-10000350 | Office Supplies | Art | Boston Pens, Blue | 34.128 | 6 | 0.6 | -49.5720 | 0.02 | Medium
51281 | 6039 | MX-2014-169530 | 09-06-2014 | 11-06-2014 | First Class | HG-15025 | Hunter Glantz | Consumer | Bragança Paulista | São Paulo | Brazil | NaN | LATAM | South | OFF-PA-10002418 | Office Supplies | Paper | Green Bar Message Books, Multicolor | 84.000 | 5 | 0.0 | 9.2000 | 0.02 | High
51282 | 9922 | MX-2012-100258 | 28-12-2012 | 31-12-2012 | First Class | KM-16375 | Katherine Murray | Home Office | Managua | Managua | Nicaragua | NaN | LATAM | Central | OFF-PA-10004020 | Office Supplies | Paper | SanDisk Message Books, 8.5 x 11 | 18.640 | 1 | 0.0 | 8.0000 | 0.01 | Medium
51283 | 24105 | IN-2014-72327 | 30-05-2014 | 30-05-2014 | Same Day | KH-16330 | Katharine Harms | Corporate | Lucknow | Uttar Pradesh | India | NaN | APAC | Central Asia | OFF-PA-10000215 | Office Supplies | Paper | Eaton Parchment Paper, Premium | 26.940 | 2 | 0.0 | 1.8600 | 0.01 | High
51284 | 24175 | IN-2014-57662 | 05-08-2014 | 10-08-2014 | Standard Class | DB-13270 | Deborah Brumfield | Home Office | Townsville | Queensland | Australia | NaN | APAC | Oceania | OFF-BI-10002424 | Office Supplies | Binders | Avery Binder, Economy | 58.050 | 5 | 0.1 | 19.9500 | 0.01 | Medium
51285 | 29002 | IN-2014-62366 | 19-06-2014 | 19-06-2014 | Same Day | KE-16420 | Katrina Edelman | Corporate | Kure | Hiroshima | Japan | NaN | APAC | North Asia | OFF-FA-10000746 | Office Supplies | Fasteners | Advantus Thumb Tacks, 12 Pack | 65.100 | 5 | 0.0 | 4.5000 | 0.01 | Medium
51286 | 35398 | US-2014-102288 | 20-06-2014 | 24-06-2014 | Standard Class | ZC-21910 | Zuschuss Carroll | Consumer | Houston | Texas | United States | 77095.0 | US | Central | OFF-AP-10002906 | Office Supplies | Appliances | Hoover Replacement Belt for Commercial Guardsman Heavy-Duty Upright Vacuum | 0.444 | 1 | 0.8 | -1.1100 | 0.01 | Medium
51287 | 40470 | US-2013-155768 | 02-12-2013 | 02-12-2013 | Same Day | LB-16795 | Laurel Beltran | Home Office | Oxnard | California | United States | 93030.0 | US | West | OFF-EN-10001219 | Office Supplies | Envelopes | #10- 4 1/8" x 9 1/2" Security-Tint Envelopes | 22.920 | 3 | 0.0 | 11.2308 | 0.01 | High
51288 | 9596 | MX-2012-140767 | 18-02-2012 | 22-02-2012 | Standard Class | RB-19795 | Ross Baird | Home Office | Valinhos | São Paulo | Brazil | NaN | LATAM | South | OFF-BI-10000806 | Office Supplies | Binders | Acco Index Tab, Economy | 13.440 | 2 | 0.0 | 2.4000 | 0.00 | Medium
51289 | 6147 | MX-2012-134460 | 22-05-2012 | 26-05-2012 | Second Class | MC-18100 | Mick Crebagga | Consumer | Tipitapa | Managua | Nicaragua | NaN | LATAM | Central | OFF-PA-10004155 | Office Supplies | Paper | Eaton Computer Printout Paper, 8.5 x 11 | 61.380 | 3 | 0.0 | 1.8000 | 0.00 | High